Comparative Summarization via Latent Dirichlet Allocation

نویسندگان

  • Michal Campr
  • Karel Jezek
چکیده

This paper aims to explore the possibility of using Latent Dirichlet Allocation (LDA) for multi-document comparative summarization which detects the main differences in documents. The first two sections of this paper focus on the definition of comparative summarization and a brief explanation of using the LDA topic model in this context. In the last three sections, our novel method for multi-document comparative summarization using LDA is presented and also its results are compared with the results of a similar method based on Latent Semantic Analysis.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Topic Models for Comparative Summarization

This paper aims to sum up our work in the area of comparative summarization and to present our results. The focus of comparative summarization is the analysis of input documents and the creation of summaries which depict the most significant differences in them. We experiment with two well known methods – Latent Semantic Analysis and Latent Dirichlet Allocation – to obtain the latent topics of ...

متن کامل

A New Approach to Automatic Summarization by Using Latent Dirichlet Allocation in Conditional Random Field

A New Approach to Automatic Summarization by Using Latent Dirichlet Allocation in Conditional Random Field Xiaofeng Wu, Chengqing Zong (National Lab of Pattern Recognition, Institute of Automation, CAS, Beijing 100190, China) Abustract: In recent years, Latent Dirichlet Allocation(LDA) has been used more and more in Document Clustering, Classification, Segmentation, and some one has used it in ...

متن کامل

Understanding large text corpora via sparse machine learning

Sparse machine learning has recently emerged as powerful tool to obtain models of high-dimensional data with high degree of interpretability, at low computational cost. The approach has been successfully used in many areas, such as signal and image processing. This paper posits that these methods can be extremely useful in the analysis of large collections of text documents, without requiring u...

متن کامل

Obtaining Single Document Summaries Using Latent Dirichlet Allocation

In this paper, we present a novel approach that makes use of topic models based on Latent Dirichlet allocation(LDA) for generating single document summaries. Our approach is distinguished from other LDA based approaches in that we identify the summary topics which best describe a given document and only extract sentences from those paragraphs within the document which are highly correlated give...

متن کامل

The Information Extraction Systems of PRIS at Temporal Summarization Track

This paper describes the information extraction systems of PRIS at Temporal Summarization Track. The Temporal Summarization Track includes two tasks: sequential update summarization and value tracking. For the first task, we focus attention on keywords mining and sentence scoring. The system utilizes hierarchical Latent Dirichlet Allocation (LDA) to do keywords mining and score sentences with k...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013